Evapotranspiration EDA

EAS 520

Ronny Hernández Mora

2/27/2021

1 Workflow

Using eddy covariance open-data from Enviro-Net platform and the R programming language to estimate real evapotranspiration in the tropical dry forest

2 Import data

We are going to import three datasets that comes from the Enviro-Net open-data platform. This data sets are:

  • carbon_tower_streams: Data from the Carbon Tower that it’s from the streams sensor station
  • carbon_tower_eddy: Data from the Carbon Tower that it’s from the Eddy Co. sensor station
  • principe_tower_eddy: Data from the Principe Tower that it’s from the Eddy Co. sensor station
carbon_tower_streams <- read_csv("data/COSTARICA-SantaRosaNationalPark,Guanacaste_CarbonTower(New)_Streams_Streams(12345)_20181228_20190128.csv") %>% 
  clean_names()

carbon_tower_eddy <- read_csv("data/COSTARICA-SantaRosaNationalPark,Guanacaste_CarbonTower(New)_EddyCo._CampbellCR3000(1)_20130607_20190512.csv") %>% 
  clean_names()

principe_tower_eddy <- read_csv("data/COSTARICA-SantaRosaNationalPark,Guanacaste_PrincipeTower(New)_EddyCo._CampbellCR3000(2)_20151213_20190622.csv") %>% 
  clean_names()

Notes on datasets

  • All datapoints are on 30 min intervals

3 carbon_tower_streams EDA

In order to understand the structure of the data set please check the DICTIONARY file

Note Even if you select a date range from 2013 to present in the enviro-net portal, there are just two months available to retrieve.

3.1 Evapotranspiration EDA from streams

First we check the months and years available in the data set:

carbon_tower_streams %>% 
  mutate(date = as_date(date_time)) %>% 
  group_by(year(date), month(date)) %>% 
  tally() %>% 
  ungroup() %>% 
  rename("Year" = `year(date)`,
         "Month" = `month(date)`,
         "Total observations" = "n") %>% 
  gt() %>% 
  tab_header(
    title = md("**Total observations per date**"),
    subtitle = ("For the carbon tower data streams")
  ) 
Total observations per date
For the carbon tower data streams
Year Month Total observations
2018 12 192
2019 1 1308


The date range we have available is 2018-12-28 00:00:19, 2019-01-28 05:30:19, that’s just one month.

# Evapotranspiration
carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(x = date_time, y = evapotranspiration)) +
  geom_point() +
  # Si tenemos datetime, cambiamos el scale_x
  scale_x_datetime(date_labels = "%d-%b-%Y",  date_breaks = "1 day") +
  theme_light(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(title = "Evapotranspiration from the Carbon tower in Santa Rosa",
       subtitle = "Data from the streams sensor station that includes the
       evapotranspiration calculation",
       x = "Date", y = "Evapotranspiration kg m-2 s-1")

# day <- interval(hms("06:00:00"), hms("17:30:00"))
# carbon_tower_streams %>%
#   mutate(day = ifelse(hms(date_time) > hms("06:00:00") & 
#                         hms(date_time) > hms("17:00:00"), 
#                       "dia", "noche")) %>%
#   select(date_time, day) %>%  View()
carbon_tower_streams %>% 
  separate(col = date_time, into = c("date", "time"), sep = " ") %>% 
  mutate(time = hms(time),
         day = ifelse(hms(time) > hms("05:20:00") & 
                        hms(time) < hms("17:20:00"),
                      "day", "nigth")) %>%
  replace_na(list(day = "nigth")) %>% 
  ggplot(aes(x = date, y = evapotranspiration, colour = day)) +
  geom_jitter(alpha = 0.5, size = 2) +
  theme_light(base_size = 12) +
  theme(axis.text.x = element_text(angle = 75, h = 1)) +
  scale_y_continuous(breaks = seq(-0.2, 0.2, by = 0.02)) +
  labs(title = "Evapotranspiration carbon tower Santa Rosa",
       subtitle = "Day defined as the interval of date time from 05:20:00 to 17:20:00",
       x = "Date", 
       y = "Evapotranspiration kg m-2 s-1",
       fill = "Day period")

There are some values with high latent_heat_flux and thus high evapotranspiration values in the night. Is this normal or is due to an error?

I defined day as those observations between the date_time > 05:20:00 and date_time 17:20:00 in order to check this values

# Check nigth values with evapotranspiration close to day values
## above 0.04 we can try to explore those values in the nigth category
carbon_tower_streams %>% 
  separate(col = date_time, into = c("date", "time"), sep = " ") %>% 
  mutate(time = hms(time),
         day = ifelse(hms(time) > hms("05:20:00") & 
                        hms(time) < hms("17:20:00"),
                      "day", "nigth")) %>%
  # This replace is due to the 00:00 time format
  replace_na(list(day = "nigth")) %>% 
  filter(evapotranspiration > 0.04) %>% 
  DT::datatable(
    rownames = FALSE,
    extensions = 'Buttons',
    selection = list(mode = 'single', target = 'row'),
    options = list(
      lengthMenu = list(c(20, 50, 100, -1), c('20', '50', '100', 'All')),
      dom = 'Bflp',
      buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
      scrollX = T)) 
# Revisión de datos de evapotranspiracion y latent heat
check <- carbon_tower_streams %>% 
  separate(col = date_time, into = c("date", "time"), sep = " ") %>% 
  mutate(time = hms(time),
         day = ifelse(hms(time) > hms("05:20:00") & 
                        hms(time) < hms("17:20:00"),
                      "day", "nigth")) %>%
  replace_na(list(day = "nigth")) %>% 
  mutate(eva_plus = ifelse(evapotranspiration > 0.04, "high", "low"))

# ¿Qué características tienen los puntos de noche con eva > 0.04?
check %>% 
  select(time, day, evapotranspiration, latent_heat_flux) %>% 
  filter(evapotranspiration > 0.04, day == "nigth") %>% 
  arrange(desc(evapotranspiration)) %>% 
  head(15) %>% 
  gt() %>% 
  tab_header(
    title = "High  evapotranspiration values during nigth",
    subtitle = "For the carbon tower data streams"
  )
High evapotranspiration values during nigth
For the carbon tower data streams
time day evapotranspiration latent_heat_flux
18H 0M 19S nigth 0.09996834 99.96835
22H 0M 19S nigth 0.08141080 81.41080
19S nigth 0.07540219 75.40219
21H 0M 19S nigth 0.07431184 74.31184
1H 0M 19S nigth 0.06476972 64.76971
18H 30M 19S nigth 0.06440952 64.40952
3H 0M 19S nigth 0.06401478 64.01479
21H 30M 19S nigth 0.05914803 59.14803
3H 0M 19S nigth 0.05582307 55.82307
1H 30M 19S nigth 0.05388340 53.88340
18H 0M 19S nigth 0.05166006 51.66006
19S nigth 0.05098506 50.98506
3H 30M 19S nigth 0.04974208 49.74208
23H 30M 19S nigth 0.04831424 48.31424
4H 0M 19S nigth 0.04739451 47.39450
# Notar que el latent heat flux es la multiplicacion de
# evapotranspiracion * mil

# check %>% 
#   # filter(eva_plus == "high") %>% 
#   ggplot(aes(x = evapotranspiration, y = latent_heat_flux, colour = day)) +
#   geom_jitter(alpha = 0.3, size = 2) +
#   theme_light()

3.2 Exploration of other variables behaviour trough dates availables

This are some plots to explore and understand the behaviour of other variables in the dataset carbon_tower_streams trough time.

# Water vapor mass density carbon
carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(x = date_time, y = ambient_water_vapor_mass_density)) +
  geom_line() +
  # Si tenemos datetime, cambiamos el scale_x
  scale_x_datetime(date_labels = "%d-%b-%Y",  date_breaks = "1 day") +
  theme_light(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(title = "Water vapor mass density carbon tower Santa Rosa",
       x = "Date", y = "Water vapor mass density kg m-3")

# Latent heat flux
carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(x = date_time, y = latent_heat_flux)) +
  geom_line() +
  # Si tenemos datetime, cambiamos el scale_x
  scale_x_datetime(date_labels = "%d-%b-%Y",  date_breaks = "1 day") +
  theme_light(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(title = "Latent heat flux carbon tower Santa Rosa",
       x = "Date", y = "Latent heat flux W m-2")

# Net ecosystem exchange
carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(x = date_time, y = net_ecosystem_exchange_mmol_m_2_s_1 )) +
  geom_line() +
  # Si tenemos datetime, cambiamos el scale_x
  scale_x_datetime(date_labels = "%d-%b-%Y",  date_breaks = "1 day") +
  theme_light(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(title = "Net ecosystem exchange carbon tower Santa Rosa",
       x = "Date", y = "Net ecosystem exchange mmol m2 s1")

3.3 Relations between variables

Just to understand evapotranspiration behaviour against other variables

carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(y = evapotranspiration, 
             x = net_ecosystem_exchange_mmol_m_2_s_1 )) +
  geom_point() +
  geom_smooth(method = "lm") +
  theme_light(base_size = 16)

carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(y = evapotranspiration, 
             x = latent_heat_flux)) +
  geom_point() +
  geom_smooth(method = "lm") +
  theme_light(base_size = 16)

carbon_tower_streams %>% 
  mutate(date = ymd_hms(date_time)) %>%
  ggplot(aes(y = evapotranspiration, 
             x = ambient_water_vapor_mass_density)) +
  geom_point() +
  geom_smooth(method = "lm") +
  theme_light(base_size = 16)

4 carbon_tower_eddy EDA

In order to understand the structure of the data set please check the DICTIONARY file

Note

# Select variables to play with
eddy <- carbon_tower_eddy %>% 
  select(date_time, sensible_heat_flux, co2_flux, latent_heat_flux,
         mean_co2_concentration, mean_h2o_vapour_concentration, 
         mean_moist_air_density, mean_thermocouple_temp)

# Check available dates in the data set
eddy %>% 
  group_by(year(date_time), month(date_time, 
                                  label = TRUE,
                                  locale = Sys.getlocale("LC_CTYPE"))) %>% 
  tally() %>% 
  rename(
    Year = `year(date_time)`,
    Month = `month(date_time, label = TRUE, locale = Sys.getlocale("LC_CTYPE"))`,
    Observations = "n"
  ) %>% 
  gt() %>% 
  tab_header(
    title = md("**Total of observations per year and month**"),
    subtitle = ("For the carbon tower eddy data")
  ) %>% 
  fmt_number(columns = vars(Observations),
             decimals = 0,
             use_seps = TRUE) %>% 
  data_color(
    columns = vars(Month),
    colors = scales::col_factor("Spectral", domain = NULL)
  )
Total of observations per year and month
For the carbon tower eddy data
Month Observations
2013
Jun 1,125
Jul 1,130
Aug 1,459
Sep 1,441
Oct 1,301
Nov 1,444
Dec 1,492
2014
Jan 1,034
Feb 1,079
Mar 333
Apr 900
May 1,163
Jun 1,443
Jul 1,492
Aug 1,492
Sep 1,443
Oct 1,491
Nov 1,443
Dec 594
2015
Jan 1,323
Feb 1,124
Mar 884
Apr 1,444
May 1,428
Jun 1,439
Jul 1,491
Aug 1,369
Sep 1,443
Oct 1,493
Nov 1,443
Dec 1,381
2016
Jan 1,491
Feb 515
Mar 1,492
Apr 1,444
May 807
Jun 274
Jul 1,491
Aug 1,491
Sep 1,443
Oct 444
Nov 1,444
Dec 1,492
2017
Jan 1,492
Feb 1,344
Mar 1,491
Apr 1,444
May 852
Jun 155
Jul 1,197
Sep 1,673
Oct 3,212
Nov 933
2019
Mar 403
Apr 2,085
May 1,092
# Por mes agregado por año
eddy %>% 
  group_by(year(date_time), month(date_time)) %>% 
  tally() %>% 
  ggplot(aes(x = as.factor(`month(date_time)`),
             y = n, 
             fill = as.factor(`year(date_time)`))) +
  geom_bar(stat = "identity") +
  scale_fill_viridis_d() +
  labs(x = "Month", y = "Total observations", fill = "Year") +
  theme_light(base_size = 16)

# Separados por mes y por año
eddy %>% 
  group_by(zoo::as.yearmon(date_time)) %>% 
  tally() %>% 
  rename("date" = `zoo::as.yearmon(date_time)`, "total" = "n") %>% 
  ggplot(aes(x = as.factor(date),
             y = total, 
             fill = as.factor(year(date)))) +
  geom_bar(stat = "identity") +
  scale_fill_viridis_d() +
  labs(x = "Month", y = "Total observations", fill = "Year") +
  theme_light(base_size = 21) +
  theme(axis.text.x = element_text(angle = 70, h = 1))

4.1 Evapotranspiracion calculation with bigleafR

We can obtain evapotranspiration with the bigleafR function, in which we can use Latent Heat Flux (W m-2) with Air Temperature (C) to obtain evapotranspiracion (kg m-2 s-1)

evapotranspiration <- eddy %>% 
  mutate(evapotranspiration_kg = LE.to.ET(latent_heat_flux,
                                          mean_thermocouple_temp)) %>% 
  # Transform units
  mutate(evapotranspiration_mol = kg.to.mol((evapotranspiration_kg) * 1000))

With the dataset carbon_tomer_streams we saw that there is a strong correlation between latent_heat_flux and evapotranspiration and it’s basically latent_heat_flux * 1000. I want to check if this is the same with the results obtained using the bigleafR package:

evapotranspiration %>% 
  select(latent_heat_flux, evapotranspiration_kg) %>% 
  head(20) %>% 
  gt() %>% 
  tab_header(
    title = md("**Comparison of Latent Heat Flux with Evapotranspiration**"),
    subtitle = "For the Carbon Tower eddy data"
  ) %>% 
  tab_footnote(
     footnote = "Shows only the first 20 observations",
     locations = cells_column_labels(
       columns = vars(latent_heat_flux, evapotranspiration_kg)
     )
  )
Comparison of Latent Heat Flux with Evapotranspiration
For the Carbon Tower eddy data
latent_heat_flux1 evapotranspiration_kg1
193.494202 0.00007960714624
230.060394 0.00009469096193
108.607399 0.00004469616681
34.383789 0.00001411770145
16.015671 0.00000657103328
21.067499 0.00000863878593
4.983779 0.00000204267645
1.355310 0.00000055534245
-1.621751 -0.00000066433338
2.481831 0.00000101648878
1.199968 0.00000049142718
-3.899054 -0.00000159664364
2.369925 0.00000097043228
2.766600 0.00000113293310
-10.047530 -0.00000411504511
6.833600 0.00000279878946
-25.120621 -0.00001029220326
0.162703 0.00000006666249
46.217941 0.00001892208373
-71.067757 -0.00002909555699

1 Shows only the first 20 observations

4.2 Exploratory visualizations from eddy

  • Plot with all the evapotranspiration data points from 2013 to 2019
  • There is no data for 2018 and only 3 months for 2019
evapotranspiration %>% 
  mutate(date = ymd_hms(date_time)) %>%
  # filter values above 0.05
  filter(evapotranspiration_mol < 1000 & evapotranspiration_mol > -500) %>%
  # filter(year(date_time) == 2013) %>% 
  ggplot(aes(x = date_time, y = evapotranspiration_mol)) +
  geom_point(alpha = 0.5) +
  scale_x_datetime(date_labels = "%b", breaks = "months") +
  theme_light(base_size = 21) +
  theme(axis.text.x = element_text(angle = 75, h = 1)) +
  labs(title = "Evapotranspiration carbon tower Santa Rosa",
       subtitle = "Al the datapoints available",
       x = "Date", y = "Evapotranspiration mmol m-2 s-1")

plot_evapotranspiration <- function(data, year) {
  data %>%
    mutate(date = ymd_hms(date_time),
           year = year(date_time)) %>%
    # filter values above 0.05
    # filter(evapotranspiration < 0.02 & evapotranspiration > -0.02) %>%
    filter(year == !!year) %>%
    ggplot(aes(x = date_time, y = evapotranspiration_mol)) +
    geom_point(alpha = 0.5) +
    # geom_line() +
    scale_x_datetime(date_labels = "%b", breaks = "months") +
    theme_light(base_size = 16) +
    theme(axis.text.x = element_text(angle = 75, h = 1)) +
    labs(title = "Evapotranspiration carbon tower Santa Rosa",
         subtitle = paste("With all the datapoints available for", year),
         x = "Date", y = "Evapotranspiration mmol m-2 s-1")
}

## Get years
years <- evapotranspiration %>% 
  transmute(year = year(date_time)) %>% 
  distinct() %>% 
  pull()

# Map to get all plots
map(.x = years, .f = function(plots) {
  evapotranspiration %>% 
    plot_evapotranspiration(year = plots)
})
## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

## 
## [[6]]

5 principe_tower_eddy EDA

In order to understand the structure of the data set please check the DICTIONARY file

Note The data available in enviro-net from this tower does not contains stream data. But it does have the variable air_temperature

# Select variables to play with
eddy <- principe_tower_eddy %>% 
  select(date_time, sensible_heat_flux, co2_flux, latent_heat_flux,
         mean_co2_concentration, mean_h2o_vapour_concentration, 
         mean_moist_air_density, mean_thermocouple_temp,
         air_temperature, humidity, saturation_vapour_pressure_k_pa)

# Check available dates in the data set
eddy %>% 
  group_by(year(date_time), month(date_time, 
                                  label = TRUE,
                                  locale = Sys.getlocale("LC_CTYPE"))) %>% 
  tally() %>% 
  rename(
    Year = `year(date_time)`,
    Month = `month(date_time, label = TRUE, locale = Sys.getlocale("LC_CTYPE"))`,
    Observations = "n"
  ) %>% 
  gt() %>% 
  tab_header(
    title = md("**Total of observations per year and month**"),
    subtitle = ("For the Principe tower eddy data")
  ) %>% 
  fmt_number(columns = vars(Observations),
             decimals = 0,
             use_seps = TRUE) %>% 
  data_color(
    columns = vars(Month),
    colors = scales::col_factor("Spectral", domain = NULL)
  )
Total of observations per year and month
For the Principe tower eddy data
Month Observations
2015
Dec 904
2016
Jan 448
2017
Jul 697
Aug 1,506
Sep 1,617
Oct 1,562
2018
Jan 1,169
Feb 1,803
Mar 885
May 678
Jul 677
Aug 855
Oct 517
Nov 835
2019
Jan 360
Feb 1,204
Mar 1,494
Apr 1,450
May 829
Jun 1,029
# Por mes agregado por año
eddy %>% 
  group_by(year(date_time), month(date_time)) %>% 
  tally() %>% 
  ggplot(aes(x = as.factor(`month(date_time)`),
             y = n, 
             fill = as.factor(`year(date_time)`))) +
  geom_bar(stat = "identity") +
  scale_fill_viridis_d() +
  labs(x = "Month",
       y = "Total observations",
       fill = "Year") +
  theme_light(base_size = 12)

# Separados por mes y por año
eddy %>% 
  group_by(zoo::as.yearmon(date_time)) %>% 
  tally() %>% 
  rename("date" = `zoo::as.yearmon(date_time)`, "total" = "n") %>% 
  ggplot(aes(x = as.factor(date),
             y = total, 
             fill = as.factor(year(date)))) +
  geom_bar(stat = "identity") +
  scale_fill_viridis_d() +
  theme_light(base_size = 21) +
  labs(x = "Date", y = "Total observations", fill = "Year") +
  theme(axis.text.x = element_text(angle = 70, h = 1))

5.1 Evapotranspiracion calculation with bigleafR

evapotranspiration <- eddy %>% 
  mutate(evapotranspiration_kg = LE.to.ET(latent_heat_flux,
                                       air_temperature)) %>% 
  # Transform units
  mutate(evapotranspiration_mol = kg.to.mol((evapotranspiration_kg) * 1000))
evapotranspiration %>% 
  select(latent_heat_flux, evapotranspiration_kg) %>% 
  head(20) %>% 
  gt() %>% 
  tab_header(
    title = md("**Comparison of Latent Heat Flux with Evapotranspiration**"),
    subtitle = "For the Principe tower eddy data"
  ) %>% 
  tab_footnote(
     footnote = "Shows only the first 20 observations",
     locations = cells_column_labels(
       columns = vars(latent_heat_flux, evapotranspiration_kg)
     )
  )
Comparison of Latent Heat Flux with Evapotranspiration
For the Principe tower eddy data
latent_heat_flux1 evapotranspiration_kg1
27.75327 0.000011361045
20.51730 0.000008398204
25.59195 0.000010475229
65.54984 0.000026847631
75.03687 0.000030745849
91.85464 0.000037640203
100.04420 0.000041005988
121.25880 0.000049701252
142.39890 0.000058389518
134.72060 0.000055268658
160.06419 0.000065684064
164.36459 0.000067486372
169.28529 0.000069531753
189.70660 0.000077929680
133.58501 0.000054883470
260.44470 0.000107044942
190.61430 0.000078374277
235.16400 0.000096675505
147.53140 0.000060628577
175.46950 0.000072108200

1 Shows only the first 20 observations

5.2 Exploratory visualizations from eddy

  • Plot with al the data points of evapotranspiration from 2013 to 2019
  • There is no data for 2018 and only 3 months for 2019
evapotranspiration %>% 
  mutate(date = ymd_hms(date_time)) %>%
  # filter values above 0.05
  # filter(evapotranspiration < 0.02 & evapotranspiration > -0.02) %>% 
  # filter(year(date_time) == 2013) %>% 
  ggplot(aes(x = date_time, y = evapotranspiration_mol)) +
  geom_point(alpha = 0.5) +
  scale_x_datetime(date_labels = "%b", breaks = "months") +
  theme_light(base_size = 16) +
  theme(axis.text.x = element_text(angle = 75, h = 1)) +
  labs(title = "Evapotranspiration Principe tower Santa Rosa",
       subtitle = "Al the datapoints available",
       x = "Date", y = "Evapotranspiration mmol m-2 s-1")

plot_evapotranspiration <- function(data, year) {
  data %>%
    mutate(date = ymd_hms(date_time),
           year = year(date_time)) %>%
    # filter values above 0.05
    # filter(evapotranspiration < 0.02 & evapotranspiration > -0.02) %>%
    filter(year == !!year) %>%
    ggplot(aes(x = date_time, y = evapotranspiration_mol)) +
    geom_point(alpha = 0.5) +
    # geom_line() +
    scale_x_datetime(date_labels = "%b", breaks = "months") +
    theme_light(base_size = 16) +
    theme(axis.text.x = element_text(angle = 75, h = 1)) +
    labs(title = "Evapotranspiration Principe tower Santa Rosa",
         subtitle = paste("With all the datapoints available for", year),
         x = "Date", y = "Evapotranspiration kg m-2 s-1")
}

## Get years
years <- evapotranspiration %>% 
  transmute(year = year(date_time)) %>% 
  distinct() %>% 
  pull()

# Map to get all plots
map(.x = years, .f = function(plots) {
  evapotranspiration %>% 
    plot_evapotranspiration(year = plots)
})
## [[1]]

## 
## [[2]]

## 
## [[3]]

## 
## [[4]]

## 
## [[5]]

6 Evapotranpiration

6.1 Notes on calculating real evapotranspiration

  • It looks that the function LE.to.ET from the bigleaf R package calculates potential evapotranspiration rather than real evapotranspiration. My guess is because we only need Latent Heat Flux and Air Temperature to calculate this, but the problem is that at least in Santa Rosa we have a dry season were there is almost no rain, and if there is no water, there is not going to be evaporation and transpiration.
  • From the REddyProc package, it says that the function from the bigleaf to calculate evapotranspiration, needs to be converted to mmol/m2/s because it’s on Kg/m2/s units. This can be done by the following:
LE <- seq(300, 500, by = 50)
Tair <- 25

ETkg <- LE.to.ET(LE, Tair)
(ETmmol <- kg.to.mol(ETkg) * 1000)
evapotranspiration <- eddy %>% 
  mutate(evapotranspiration_kg = LE.to.ET(latent_heat_flux,
                                       air_temperature)) %>% 
  # Transform units
  mutate(evapotranspiration_mol = kg.to.mol((evapotranspiration_kg) * 1000))
evapotranspiration %>% 
  mutate(date = ymd_hms(date_time)) %>%
  # filter values above 0.05
  # filter(evapotranspiration < 0.02 & evapotranspiration > -0.02) %>% 
  # filter(year(date_time) == 2013) %>% 
  ggplot(aes(x = date_time, y = evapotranspiration_mol)) +
  geom_point(alpha = 0.5) +
  scale_x_datetime(date_labels = "%b", breaks = "months") +
  theme_light(base_size = 21) +
  theme(axis.text.x = element_text(angle = 75, h = 1)) +
  labs(title = "Evapotranspiration Principe tower Santa Rosa",
       subtitle = "Al the datapoints available",
       x = "Date", y = "Evapotranspiration mmol m-2 s-1")

7 Comparison between data sets with error/no error

On the enviro-net platform we have the option to create a filter based on the allowable values for the sensors:

I repeated the selection done before with the only difference that this time was with the option Error Removal selected.

Due to the lack of information from enviro-net of what those ranges are, we can skim through the data and try to validate those ranges

# Read data with error removal
carbon_tower_eddy_error_removal <- read_csv("data/COSTARICA-SantaRosaNationalPark,Guanacaste_CarbonTower(New)_EddyCo._CampbellCR3000(1)_20130607_20190512_error_removal.csv") %>% 
  clean_names()

# skim the data:
skim(carbon_tower_eddy_error_removal)
Table 7.1: Data summary
Name carbon_tower_eddy_error_r…
Number of rows 70762
Number of columns 95
_______________________
Column type frequency:
logical 3
numeric 91
POSIXct 1
________________________
Group variables None

Variable type: logical

skim_variable n_missing complete_rate mean count
sonic_azimuth 70762 0 NaN :
mean_pump_cell_pressure 70762 0 NaN :
mean_ambient_pressure 70762 0 NaN :

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
sensible_heat_flux 7423 0.90 40.74 117.62 -571.35 -27.89 -4.33 70.35 876.34 ▁▇▅▁▁
momentum_flux 7421 0.90 0.56 0.59 0.00 0.07 0.32 0.92 4.68 ▇▂▁▁▁
friction_velocity 1898 0.97 0.59 0.38 0.00 0.25 0.54 0.91 2.09 ▇▆▅▁▁
sonic_temp_std_dev 1900 0.97 0.31 0.19 0.00 0.17 0.26 0.41 2.47 ▇▂▁▁▁
covariance_u_ts 1913 0.97 -0.02 0.21 -1.91 -0.06 0.01 0.08 2.57 ▁▂▇▁▁
covariance_v_ts 1900 0.97 0.05 0.30 -3.79 -0.09 -0.01 0.10 4.90 ▁▁▇▁▁
covariance_w_ts 1898 0.97 0.04 0.11 -0.98 -0.02 0.00 0.06 0.90 ▁▁▇▁▁
horizontal_wind_velocity_std_dev 1902 0.97 1.36 0.74 0.00 0.72 1.28 1.94 3.95 ▇▇▇▂▁
covariance_u_v 1907 0.97 -0.20 0.61 -10.00 -0.44 -0.06 0.09 3.94 ▁▁▁▇▁
covariance_u_w 1898 0.97 -0.21 0.32 -2.71 -0.38 -0.10 0.00 3.52 ▁▅▇▁▁
cross_wind_velocity_std_dev 1898 0.97 1.49 0.81 0.00 0.77 1.40 2.12 5.12 ▇▇▆▁▁
covariance_v_w 1905 0.97 0.38 0.48 -2.32 0.01 0.19 0.67 2.99 ▁▁▇▂▁
vertical_wind_velocity_std_dev 1897 0.97 0.76 0.46 0.00 0.36 0.71 1.14 2.37 ▇▆▆▂▁
mean_csat3_horizontal_wind_speed 1897 0.97 4.06 2.29 0.23 2.10 3.66 5.79 13.62 ▇▇▅▁▁
resultant_mean_csat3_wind_speed 1897 0.97 3.79 2.23 0.02 1.92 3.43 5.48 13.16 ▇▇▅▁▁
sonic_derived_wind_direction 1897 0.97 -35.60 74.42 -180.00 -81.53 -47.33 -15.65 180.00 ▂▇▃▁▁
std_dev_of_csat3_wind_direction 1897 0.97 21.34 9.97 0.00 16.18 18.83 22.75 80.00 ▃▇▁▁▁
compass_true_n_referenced_wind_direction 1897 0.97 116.89 92.68 0.01 53.80 80.00 146.39 359.99 ▇▆▂▁▂
mean_horizontal_wind_velocity 1917 0.97 1.42 2.14 -4.82 -0.16 1.27 2.96 10.51 ▁▇▇▂▁
mean_cross_wind_horizontal_wind_velocity 5999 0.92 -2.01 2.31 -7.00 -3.71 -1.94 -0.20 6.84 ▃▇▇▂▁
mean_vertical_wind_velocity 1897 0.97 0.18 0.14 -0.87 0.07 0.16 0.27 1.44 ▁▂▇▁▁
average_sonic_temp 34411 0.51 26.48 0.99 0.00 25.82 26.59 27.28 28.00 ▁▁▁▁▇
sonic_samples_total 24760 0.65 17109.87 3784.27 0.00 18000.00 18000.00 18000.00 18000.00 ▁▁▁▁▇
no_sonic_head_total 263 1.00 0.00 0.02 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
no_new_sonic_data_total 163 1.00 0.02 2.08 0.00 0.00 0.00 0.00 508.00 ▇▁▁▁▁
csat3_signal_amplitude_low_flag 317 1.00 443.92 2578.67 0.00 0.00 0.00 0.00 18000.00 ▇▁▁▁▁
csat3_signal_amplitude_high_flag 304 1.00 303.15 1915.37 0.00 0.00 0.00 0.00 17940.00 ▇▁▁▁▁
csat3_poor_signal_lock_flag 302 1.00 187.95 1353.64 0.00 0.00 0.00 0.00 15683.00 ▇▁▁▁▁
csat3_delta_temperature_flag 257 1.00 22.61 192.33 0.00 0.00 0.00 0.00 4079.00 ▇▁▁▁▁
aq_sig_f_total 255 1.00 72.51 836.21 0.00 0.00 0.00 0.00 17903.00 ▇▁▁▁▁
sonic_cal_err_f_tot 262 1.00 0.00 0.02 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
co2_flux 7463 0.89 -0.10 0.81 -65.19 -0.22 0.02 0.12 9.21 ▁▁▁▁▇
latent_heat_flux 7436 0.89 50.50 104.14 -5173.04 -0.52 16.13 77.75 2309.99 ▁▁▁▇▁
sensible_heat_flux_including_wpl_adjustment_in_le 7428 0.90 37.50 115.47 -1047.03 -28.07 -4.60 62.37 1082.82 ▁▁▇▁▁
co2_concentration_std_dev 7440 0.89 3.01 5.76 0.00 0.68 1.54 3.40 194.07 ▇▁▁▁▁
covariance_u_co2 7464 0.89 0.21 1.83 -95.67 -0.26 0.04 0.55 36.01 ▁▁▁▇▁
covariance_v_co2 7455 0.89 -0.02 2.01 -53.84 -0.54 0.06 0.49 51.71 ▁▁▇▁▁
covariance_w_co2 7463 0.89 -0.10 0.81 -65.19 -0.22 0.02 0.12 9.21 ▁▁▁▁▇
h2o_std_dev 7427 0.90 0.24 0.77 0.00 0.10 0.19 0.31 134.87 ▇▁▁▁▁
covariance_u_h2o 7450 0.89 -0.05 0.15 -3.88 -0.08 -0.02 0.01 1.93 ▁▁▁▇▁
covariance_v_h2o 7453 0.89 0.06 0.18 -2.16 -0.01 0.02 0.10 2.92 ▁▁▇▁▁
covariance_w_h2o 7437 0.89 0.02 0.04 -1.64 0.00 0.01 0.03 1.51 ▁▁▇▁▁
thermocouple_temp_std_dev 7431 0.89 0.31 0.19 0.00 0.17 0.26 0.40 2.91 ▇▁▁▁▁
covariance_u_tc 7429 0.90 -0.01 0.21 -2.87 -0.05 0.01 0.08 6.71 ▁▇▁▁▁
covariance_v_tc 7429 0.90 0.04 0.29 -4.58 -0.09 -0.01 0.09 5.10 ▁▁▇▁▁
covariance_w_tc 7428 0.90 0.03 0.10 -2.76 -0.02 0.00 0.06 0.96 ▁▁▁▇▁
mean_co2_concentration 17973 0.75 658.87 28.87 618.00 636.41 652.97 677.11 1277.36 ▇▁▁▁▁
mean_h2o_vapour_concentration 7481 0.89 15.40 2.51 2.33 13.70 15.40 17.17 23.00 ▁▁▆▇▁
mean_pump_cell_temp 16264 0.77 26.80 2.01 21.05 25.23 26.56 28.26 31.00 ▁▅▇▅▃
mean_thermocouple_temp 36019 0.49 24.54 0.94 20.79 23.90 24.65 25.31 26.00 ▁▁▅▇▇
mean_moist_air_density 7425 0.90 1.13 0.01 1.04 1.12 1.13 1.14 1.90 ▇▁▁▁▁
mean_diff_pressure 6623 0.91 -3.70 0.72 -7.00 -3.77 -3.48 -3.23 -1.01 ▁▂▇▇▁
mean_td_temp 5997 0.92 17.36 2.70 -0.99 15.67 17.56 19.36 24.73 ▁▁▂▇▃
factor_co2 7428 0.90 1.72 0.02 1.57 1.71 1.72 1.74 1.77 ▁▁▁▇▃
factor_h2o 7425 0.90 0.71 0.01 0.39 0.70 0.71 0.71 0.87 ▁▁▁▇▁
total_no_samples_through_irga 22689 0.68 15108.47 6542.35 0.00 17997.00 18000.00 18000.00 18000.00 ▂▁▁▁▇
no_new_irga_data_total 163 1.00 0.02 2.08 0.00 0.00 0.00 0.00 508.00 ▇▁▁▁▁
total_irga_bad_data_flags 2476 0.97 971.95 4009.34 0.00 0.00 0.00 0.00 18000.00 ▇▁▁▁▁
total_irga_startup_flags 275 1.00 0.07 6.11 0.00 0.00 0.00 0.00 740.00 ▇▁▁▁▁
total_motor_speed_flags 564 0.99 0.00 0.23 0.00 0.00 0.00 0.00 26.00 ▇▁▁▁▁
tec_tmpr_f_tot 274 1.00 0.04 4.01 0.00 0.00 0.00 0.00 505.00 ▇▁▁▁▁
total_source_power_flags 650 0.99 0.00 0.21 0.00 0.00 0.00 0.00 20.00 ▇▁▁▁▁
co2_i_f_tot 255 1.00 23.91 569.21 0.00 0.00 0.00 0.00 18000.00 ▇▁▁▁▁
co2_io_f_tot 359 0.99 2.61 57.75 0.00 0.00 0.00 0.00 3094.00 ▇▁▁▁▁
h2o_i_f_tot 371 0.99 2.14 45.24 0.00 0.00 0.00 0.00 2301.00 ▇▁▁▁▁
h2o_io_f_tot 373 0.99 2.04 41.51 0.00 0.00 0.00 0.00 2065.00 ▇▁▁▁▁
co2_io_var_f_tot 379 0.99 4.30 69.87 0.00 0.00 0.00 0.00 1701.00 ▇▁▁▁▁
h2o_io_var_f_tot 360 0.99 4.26 72.52 0.00 0.00 0.00 0.00 1799.00 ▇▁▁▁▁
co2_sig_strength_flag_total 263 1.00 28.58 606.68 0.00 0.00 0.00 0.00 18000.00 ▇▁▁▁▁
h2o_sig_strength_flag_total 263 1.00 28.03 603.75 0.00 0.00 0.00 0.00 18000.00 ▇▁▁▁▁
total_differential_pressure_flags 3618 0.95 40.23 646.85 0.00 0.00 0.00 0.00 15185.00 ▇▁▁▁▁
mean_co2_signal_strength_flag_total 7266 0.90 0.95 0.04 0.00 0.94 0.95 0.98 1.02 ▁▁▁▁▇
mean_h2o_signal_strength_flag_total 7266 0.90 0.93 0.05 0.00 0.91 0.93 0.97 1.02 ▁▁▁▁▇
average_pump_temp 163 1.00 97.32 76.71 21.59 44.56 58.77 118.76 435.67 ▇▁▁▁▁
average_pump_pressure 7320 0.90 92.05 1.08 81.51 91.74 92.43 92.68 93.00 ▁▁▁▁▇
average_pump_flow_duty_cycle 159 1.00 0.75 0.12 0.00 0.76 0.77 0.77 1.00 ▁▁▁▇▁
average_pump_flow 3001 0.96 7.00 0.01 5.36 7.00 7.00 7.00 7.00 ▁▁▁▁▇
pump_flow_std_dev 193 1.00 0.01 0.05 0.00 0.00 0.00 0.00 2.95 ▇▁▁▁▁
pump_flow_set_point 159 1.00 7.00 0.00 7.00 7.00 7.00 7.00 7.00 ▁▁▇▁▁
valve_temp_average 12062 0.83 26.89 1.99 18.26 25.36 26.70 28.34 31.00 ▁▁▆▇▅
valve_heater_average 171 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ▁▁▇▁▁
intake_heater_average 171 1.00 0.98 0.15 0.00 1.00 1.00 1.00 1.00 ▁▁▁▁▇
average_panel_temp 12687 0.82 26.23 2.26 16.98 24.48 25.93 27.89 31.00 ▁▁▇▇▅
average_battery_voltage 1575 0.98 12.34 0.60 10.04 12.15 12.43 12.75 13.11 ▁▁▁▇▆
average_pack_voltage 26764 0.62 7.73 0.39 6.00 7.43 7.60 8.15 8.85 ▁▁▇▇▁
average_process_time 427 0.99 61462.73 1629.25 54371.52 61631.93 61924.01 62158.65 85084.36 ▁▇▁▁▁
max_process_time 595 0.99 98571.12 23563.23 73689.00 90531.00 94811.00 99149.00 370123.00 ▇▁▁▁▁
buff_depth_avg 183 1.00 0.00 0.05 0.00 0.00 0.00 0.00 3.95 ▇▁▁▁▁
buff_depth_max 201 1.00 0.48 1.96 0.00 0.00 0.00 1.00 67.00 ▇▁▁▁▁
slowsequence_tot 191 1.00 1799.01 17.27 756.00 1800.00 1800.00 1800.00 1801.00 ▁▁▁▁▇
average_co2_ppm 20571 0.71 364.41 15.07 343.00 352.87 361.52 372.63 474.91 ▇▃▁▁▁

Variable type: POSIXct

skim_variable n_missing complete_rate min max median n_unique
date_time 0 1 2013-06-07 14:30:18 2019-05-12 08:00:56 2015-10-03 19:45:37 70762

8 Session info

For this analysis, the setup used was:

sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_CR.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=es_CR.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=es_CR.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_CR.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] skimr_2.1.3      rmdformats_1.0.1 purrr_0.3.4      tidyr_1.1.3     
##  [5] gt_0.2.2         lubridate_1.7.10 bigleaf_0.7.1    visdat_0.5.3    
##  [9] janitor_2.1.0    readr_1.4.0      ggplot2_3.3.3    dplyr_1.0.5     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.6         lattice_0.20-41    zoo_1.8-9          assertthat_0.2.1  
##  [5] digest_0.6.27      utf8_1.2.1         R6_2.5.0           repr_1.1.3        
##  [9] backports_1.2.1    evaluate_0.14      highr_0.8          pillar_1.5.1      
## [13] rlang_0.4.10       rstudioapi_0.13    solartime_0.0.1    jquerylib_0.1.3   
## [17] Matrix_1.3-2       checkmate_2.0.0    rmarkdown_2.7      labeling_0.4.2    
## [21] splines_4.0.5      stringr_1.4.0      munsell_0.5.0      compiler_4.0.5    
## [25] xfun_0.22          pkgconfig_2.0.3    base64enc_0.1-3    mgcv_1.8-33       
## [29] htmltools_0.5.1.1  tidyselect_1.1.0   tibble_3.1.0       bookdown_0.21     
## [33] viridisLite_0.3.0  fansi_0.4.2        crayon_1.4.1       withr_2.4.1       
## [37] commonmark_1.7     grid_4.0.5         nlme_3.1-152       jsonlite_1.7.2    
## [41] gtable_0.3.0       lifecycle_1.0.0    DBI_1.1.1          magrittr_2.0.1    
## [45] scales_1.1.1       cli_2.3.1          stringi_1.5.3      debugme_1.1.0     
## [49] farver_2.1.0       robustbase_0.93-7  snakecase_0.11.0   bslib_0.2.4       
## [53] ellipsis_0.3.1     generics_0.1.0     vctrs_0.3.6        RColorBrewer_1.1-2
## [57] tools_4.0.5        glue_1.4.2         DEoptimR_1.0-8     hms_1.0.0         
## [61] yaml_2.2.1         colorspace_2.0-0   knitr_1.31         sass_0.3.1